5  Archaeobotany

Page under construction

The results presented here are preliminary and the chapter has yet to be written.

In this chapter, I will present the macrobotanical data from the 170 case studies used to carry out this research (Chapter 3), along with the quantifications performed on the absolute counts. The data will first be presented chronologically, and a discussion of the diachronic trends will follow at the end of the chapter.

5.1 Case studies

The following map shows the sites under investigation, divided by chronology. Please select the desired chronology (or chronologies) from the legend on the right.

Figure 5.1: Legend: R = Roman, LR = Late Roman, EMA = Early Middle Ages, Ma = 11th c. onwards

5.2 Ubiquity

In Chapter 4, ubiquity was described as the best way to present the archaeobotanical remains from the Italian peninsula, given the numerous biases in the samples. The heatmap below (Figure 5.2) provides a good overview of the temporal trends in the presence of cereals, legumes, fruits and nuts across the entire area under examination.

Show the code
# Load the libraries
# Note: these libraries are used for the data visualizations in this page.
library(RColorBrewer)
library(reshape2)
library(ggplot2)
library(hrbrthemes)
library(plotly)
library(patchwork)
library(dplyr) # Loaded explicitly: supplies %>%, mutate() and filter(), used below

## UBIQUITY

## Creating a dataframe that contains the ubiquity of each century under examination. 
Ubiquity_table <- data.frame(
  "I BCE" = archaeobotany_tables(plants_export, -1)$Ubiquity_exp,  
  "I CE" = archaeobotany_tables(plants_export, 1)$Ubiquity_exp,
  "II CE" = archaeobotany_tables(plants_export, 2)$Ubiquity_exp,
  "III CE" = archaeobotany_tables(plants_export, 3)$Ubiquity_exp,
  "IV CE" = archaeobotany_tables(plants_export, 4)$Ubiquity_exp,
  "V CE" = archaeobotany_tables(plants_export, 5)$Ubiquity_exp,
  "VI CE" = archaeobotany_tables(plants_export, 6)$Ubiquity_exp,
  "VII CE" = archaeobotany_tables(plants_export, 7)$Ubiquity_exp,
  "VIII CE" = archaeobotany_tables(plants_export, 8)$Ubiquity_exp,
  "IX CE" = archaeobotany_tables(plants_export, 9)$Ubiquity_exp,
  "X CE" = archaeobotany_tables(plants_export, 10)$Ubiquity_exp,
  "XI CE" = archaeobotany_tables(plants_export, 11)$Ubiquity_exp
  )

# Transform the ubiquity table into a matrix
Ubiquity_mat <- as.matrix(Ubiquity_table) 

# Rename the centuries
colnames(Ubiquity_mat) <- c("1st c. BCE", "1st c. CE", "2nd c. CE",
                            "3rd c. CE", "4th c. CE", "5th c. CE",
                            "6th c. CE", "7th c. CE", "8th c. CE",
                            "9th c. CE", "10th c. CE", "11th c. CE") 

# The data has to be molten to use it with ggplot2
# (package: reshape2)
Ubiquity_melt <- melt(Ubiquity_mat)

# Let's now rename the columns 
colnames(Ubiquity_melt) <- c("Taxon", "Century", "Ubiquity")

# Add a column for the text tooltip
Ubiquity_melt <- Ubiquity_melt %>%
  mutate(text = paste0("Taxon: ", Taxon, "\n", "Century: ", Century, "\n", "Value: ",round(Ubiquity,2)))

# Create the heatmap with ggplot2
Ubiquity_HM <- ggplot(Ubiquity_melt, aes(Century, Taxon, fill=Ubiquity, text=text)) + 
  geom_tile(colour="white") +
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "right",
        axis.ticks = element_blank(), 
        axis.text.x = element_text(angle = 90, hjust = 0)
        ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Ubiquity",
    subtitle="Diachronical heatmap of recorded plant species"
  ) +
  scale_fill_gradient(low = "white", high = "black")

Figure 5.2: Diachronical heatmap of recorded plant species
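
As a reminder of what each cell of the heatmap represents, the following is a minimal sketch of a ubiquity calculation in base R. It assumes that ubiquity is the percentage of sites at which a taxon is recorded. The table `toy_sites` and the helper `ubiquity_pct()` are hypothetical and use invented values; they are not the `archaeobotany_tables()` function used above.

```r
# Hypothetical presence/absence table: rows are sites, columns are taxa
# (1 = taxon recorded at the site, 0 = not recorded). Values are invented.
toy_sites <- data.frame(
  Barley = c(1, 1, 0, 1),
  Emmer  = c(0, 1, 0, 0),
  Grape  = c(1, 1, 1, 1),
  row.names = c("Site A", "Site B", "Site C", "Site D")
)

# Ubiquity (%) = sites where the taxon is present / total number of sites * 100
ubiquity_pct <- function(presence) {
  round(colSums(presence > 0) / nrow(presence) * 100, 1)
}

toy_ubiquity <- ubiquity_pct(toy_sites)
print(toy_ubiquity)
```

In this toy table barley is recorded at three of four sites (75%), emmer at one (25%) and grape at all four (100%). Because only presence is counted, a site with a single grain weighs as much as a site with thousands, which is why the measure is less sensitive to sampling and preservation biases than absolute counts.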

5.2.1 Macroregional differences

The heatmap displayed in Figure 5.2 presents diachronical ubiquity values for the entire peninsula. However, it is also possible to look at macroregional differences in plant ubiquities. The R function Ubiquity_macroreg_chrono() (Section 1.3) was created to subset the data related to the (current) Northern, Central and Southern Italian regions. Subsetting the dataset required a broader chronological division to obtain enough sites for a statistical interpretation of the results: the ubiquity values are therefore presented using the variable Chronology rather than the individual centuries. For a clearer reading of the plots, the taxa have been divided into three groups: Cereals, Pulses and Fruits/Nuts. Some taxa have been omitted from the plots.

Source code:

Show the code: data preparation
# Ubiquity by Italian Macro regions: Northern, Central and Southern Italy

# Load the libraries
library(vegan)
library(matrixStats)
library(patchwork)

# Creating a dataframe with the ubiquities of all macroregions and chronologies
bot_macroreg <- rbind(
  Ubiquity_R_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "R"),
  Ubiquity_R_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "R"),
  Ubiquity_R_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "R"),
  Ubiquity_LR_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "LR"),
  Ubiquity_LR_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "LR"),
  Ubiquity_LR_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "LR"),
  Ubiquity_EMA_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "EMA"),
  Ubiquity_EMA_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "EMA"),
  Ubiquity_EMA_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "EMA"),
  Ubiquity_Ma_NI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Northern Italy", "Ma"),
  Ubiquity_Ma_CI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Central Italy", "Ma"),
  Ubiquity_Ma_SI <- Ubiquity_macroreg_chrono(Archaeobot_Condensed,"Southern Italy", "Ma")
)

# Re-arranging the cereals/macroregions for visualisation on the Y axis
level_macroreg_order <- c("Southern Italy", "Central Italy", "Northern Italy")

level_cereals_order <- c("Common.Wheat", "Barley", "Rye", 
                         "Einkorn", "Emmer", "Proso.millet", 
                         "Foxtail.millet", "Oats", "Sorghum")

# Cereals
cer_ubiquity_macroreg.R <- filter(bot_macroreg, Chronology=="R" & Plant.Type=="Cereals")
cer_ubiquity_macroreg.R <- filter(cer_ubiquity_macroreg.R, Macroregion!="Central Italy")
cer_ubiquity_macroreg.LR <- filter(bot_macroreg, Chronology=="LR" & Plant.Type=="Cereals")
cer_ubiquity_macroreg.EMA <- filter(bot_macroreg, Chronology=="EMA" & Plant.Type=="Cereals")
cer_ubiquity_macroreg.Ma <- filter(bot_macroreg, (Chronology=="Ma" & Plant.Type=="Cereals"))
cer_ubiquity_macroreg.Ma <- filter(cer_ubiquity_macroreg.Ma, Macroregion!="Southern Italy")

#Pulses
puls_ubiquity_macroreg.R <- filter(bot_macroreg, Chronology=="R" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.R <- filter(puls_ubiquity_macroreg.R, Macroregion!="Central Italy")
puls_ubiquity_macroreg.R <- filter(puls_ubiquity_macroreg.R, Plant!="Chickpea")
puls_ubiquity_macroreg.LR <- filter(bot_macroreg, Chronology=="LR" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.LR <- filter(puls_ubiquity_macroreg.LR, 
                                    Macroregion!="Southern Italy")
puls_ubiquity_macroreg.LR <- filter(puls_ubiquity_macroreg.LR, Plant!="Chickpea")
puls_ubiquity_macroreg.EMA <- filter(bot_macroreg, Chronology=="EMA" & Plant.Type=="Pulses")
puls_ubiquity_macroreg.Ma <- filter(bot_macroreg, Chronology=="Ma"  & Plant.Type=="Pulses")
puls_ubiquity_macroreg.Ma <- filter(puls_ubiquity_macroreg.Ma, 
                                    Macroregion!="Southern Italy")

#Fruits (+ Subset)
# Taxa retained in the fruits/nuts plots
fnuts_selected <- c("Wild.Cherry", "Walnut", "Peach", "Olive", "Grape", "Fig", "Apple")

fnuts_ubiquity_macroreg.R <- filter(bot_macroreg, Chronology=="R" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.R <- subset(fnuts_ubiquity_macroreg.R, Plant %in% fnuts_selected)
fnuts_ubiquity_macroreg.R <- filter(fnuts_ubiquity_macroreg.R, Macroregion!="Central Italy")
fnuts_ubiquity_macroreg.LR <- filter(bot_macroreg, Chronology=="LR" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.LR <- subset(fnuts_ubiquity_macroreg.LR, Plant %in% fnuts_selected)
fnuts_ubiquity_macroreg.EMA <- filter(bot_macroreg, Chronology=="EMA" & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.EMA <- subset(fnuts_ubiquity_macroreg.EMA, Plant %in% fnuts_selected)
fnuts_ubiquity_macroreg.Ma <- filter(bot_macroreg, Chronology=="Ma"  & Plant.Type=="Fruits/Nuts")
fnuts_ubiquity_macroreg.Ma <- subset(fnuts_ubiquity_macroreg.Ma, Plant %in% fnuts_selected)
fnuts_ubiquity_macroreg.Ma <- filter(fnuts_ubiquity_macroreg.Ma, Macroregion!="Southern Italy")
Show the code: plots
# Cereals plots ubiquity
cer_ubiquity_macroreg_R.HM <- ggplot(cer_ubiquity_macroreg.R, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  factor(Plant, levels=rev(level_cereals_order)),
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Roman"
  ) +  scale_fill_gradient(low = "white", high = "black")


cer_ubiquity_macroreg_LR.HM <- ggplot(cer_ubiquity_macroreg.LR, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  factor(Plant, levels=rev(level_cereals_order)),
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Late Roman"
  ) +  scale_fill_gradient(low = "white", high = "black")

cer_ubiquity_macroreg_EMA.HM <- ggplot(cer_ubiquity_macroreg.EMA, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  factor(Plant, levels=rev(level_cereals_order)),
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Early Medieval"
  ) +  scale_fill_gradient(low = "white", high = "black")

cer_ubiquity_macroreg_Ma.HM <- ggplot(cer_ubiquity_macroreg.Ma, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  factor(Plant, levels=rev(level_cereals_order)),
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Medieval"
  ) +  scale_fill_gradient(low = "white", high = "black")


Cereals_Ubiquity_MacroReg_Patchwork <- (cer_ubiquity_macroreg_R.HM|cer_ubiquity_macroreg_LR.HM)/(cer_ubiquity_macroreg_EMA.HM|cer_ubiquity_macroreg_Ma.HM)
Cereals_Ubiquity_MacroReg_Patchwork + plot_annotation(
  title = 'Cereals',
  subtitle = 'Ubiquity (%), plotted by macroregion and chronology.',
  caption='Note: Data was too scarce for Roman Central Italy and Medieval Southern Italy.'
)
Show the code: plots
# Pulses plots ubiquity
puls_ubiquity_macroreg_R.HM <- ggplot(puls_ubiquity_macroreg.R, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="#cfcfcf", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Roman"
  ) +  scale_fill_gradient(low = "white", high = "black")


puls_ubiquity_macroreg_LR.HM <- ggplot(puls_ubiquity_macroreg.LR, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="#ffffff", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Late Roman"
  ) +  scale_fill_gradient(low = "white", high = "black")

puls_ubiquity_macroreg_EMA.HM <- ggplot(puls_ubiquity_macroreg.EMA, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Early Medieval"
  ) +  scale_fill_gradient(low = "white", high = "black")

puls_ubiquity_macroreg_Ma.HM <- ggplot(puls_ubiquity_macroreg.Ma, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Medieval"
  ) +  scale_fill_gradient(low = "white", high = "black")


Pulses_Ubiquity_MacroReg_Patchwork <- (puls_ubiquity_macroreg_R.HM|puls_ubiquity_macroreg_LR.HM)/(puls_ubiquity_macroreg_EMA.HM|puls_ubiquity_macroreg_Ma.HM)
Pulses_Ubiquity_MacroReg_Patchwork + plot_annotation(
  title = 'Pulses',
  subtitle = 'Ubiquity (%), plotted by macroregion and chronology.',
  caption='Note: Data was too scarce for Roman Central Italy and Late Roman/Medieval Southern Italy.'
)
Show the code: plots
# Fruits nuts plots

fnuts_ubiquity_macroreg_R.HM <- ggplot(fnuts_ubiquity_macroreg.R, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Roman"
  ) +  scale_fill_gradient(low = "white", high = "black")


fnuts_ubiquity_macroreg_LR.HM <- ggplot(fnuts_ubiquity_macroreg.LR, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Late Roman"
  ) +  scale_fill_gradient(low = "white", high = "black")

fnuts_ubiquity_macroreg_EMA.HM <- ggplot(fnuts_ubiquity_macroreg.EMA, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Early Medieval"
  ) +  scale_fill_gradient(low = "white", high = "black")

fnuts_ubiquity_macroreg_Ma.HM <- ggplot(fnuts_ubiquity_macroreg.Ma, aes(
  factor(Macroregion, levels=(level_macroreg_order)),
  Plant,
  fill=(Ubiquity)
)) + 
  geom_tile(colour="white") +
  geom_text(aes(label = Ubiquity), colour="white", size=3)+ 
  scale_alpha(range=c(0,1)) +
  scale_x_discrete("", expand = c(0, 0)) + 
  scale_y_discrete("", expand = c(0, 0)) + 
  theme_grey(base_size = 9) + 
  theme(legend.position = "none",
        axis.ticks = element_blank()
  ) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(
    title="Medieval"
  ) +  scale_fill_gradient(low = "white", high = "black")


FrNuts_Ubiquity_MacroReg_Patchwork <- (fnuts_ubiquity_macroreg_R.HM|fnuts_ubiquity_macroreg_LR.HM)/(fnuts_ubiquity_macroreg_EMA.HM|fnuts_ubiquity_macroreg_Ma.HM)
FrNuts_Ubiquity_MacroReg_Patchwork + plot_annotation(
  title = 'Fruits/Nuts',
  subtitle = 'Ubiquity (%), plotted by macroregion and chronology.',
  caption='Note: Data was too scarce for Roman Central Italy and Medieval Southern Italy.'
)
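
The function Ubiquity_macroreg_chrono() is defined in Section 1.3 and is not reproduced on this page. Purely as an illustration of the idea behind it, the sketch below computes ubiquity per macroregion and chronology in base R and flags groups with too few sites. The data frame `toy`, the threshold `min_sites` and the column `Reliable` are assumptions made for this example, with invented values; the actual function may work differently.

```r
# Hypothetical site table (invented values); one row per site.
toy <- data.frame(
  Macroregion = c("Northern Italy", "Northern Italy", "Northern Italy",
                  "Southern Italy", "Southern Italy", "Central Italy"),
  Chronology  = c("R", "R", "R", "R", "R", "R"),
  Barley      = c(1, 0, 1, 1, 1, 1),
  Rye         = c(1, 0, 0, 0, 0, 0)
)

taxa   <- c("Barley", "Rye")
groups <- list(Macroregion = toy$Macroregion, Chronology = toy$Chronology)

# Ubiquity (%) of each taxon within each macroregion/chronology group
group_ubiquity <- aggregate(
  toy[taxa] > 0, by = groups,
  FUN = function(present) round(mean(present) * 100)
)

# Number of sites behind each group
group_sizes <- aggregate(list(Sites = rep(1, nrow(toy))), by = groups, FUN = sum)
group_ubiquity <- merge(group_ubiquity, group_sizes)

# Flag groups that rest on too few sites (arbitrary threshold for the toy data)
min_sites <- 2
group_ubiquity$Reliable <- group_ubiquity$Sites >= min_sites

print(group_ubiquity)
```

Keeping the number of sites next to each percentage makes it easier to see why some groups, such as Roman Central Italy with its three sites, are left out of the plots.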

5.2.1.1 Cereals

It is interesting to note that in the Roman age cereals are similarly ubiquitous in Southern and Northern Italy, although there are some exceptions (i.e. einkorn, rye, oats, proso millet) that may derive from the randomness of the samples. Unfortunately, only three sites provided botanical samples for Roman Central Italy, and their values have been omitted from the plot. These sites (from the Roman Peasant Project, Tuscany) reported only three kinds of cereal: common wheat, emmer, and barley. The similar ubiquity values for the two macroregions under assessment in the Roman age may suggest similar production patterns across the whole peninsula. For the Late Roman age, ubiquity has been calculated for all three macroregions. Three crops are found on 62-75% of the Central Italian sites: common wheat, barley and emmer. Other cereals are present, but less ubiquitously. These three crops seem to be widespread in the south as well. Conversely, in Northern Italy common wheat and barley were important crops but competed with other cereals, including millet, sorghum, and rye (now doubled in presence). The Early Medieval age seems to mark a shift in agricultural practices: cereal ubiquities vary more markedly across the three macroregions. In Southern Italy, common wheat and barley were still the predominant cereals. This is also true for Central and Northern Italy; in these regions, however, other cereals are also present at a large number of sites. The samples from the Medieval age are fewer in number, since the upper boundary of this project’s chronology is the 11th c. Despite the short chronology, some observations can be made. Medieval Central Italy relied heavily on common wheat, barley and emmer, with other cereals increasingly important. Barley is the most ubiquitous cereal in Northern Italy in this period, followed by common wheat, millets and sorghum.

Figure 5.3: Diachronical heatmap of cereals in the Italian macroregions

5.2.1.2 Pulses

In the Roman age, pulses are an important part of the diet and are cultivated in both Northern and Southern Italy. In the latter, vetch/broad beans are present in 22-32% of the samples, and lentils are present in 38% of the sites. In the Late Roman age, broad beans are equally important in Central and Northern Italy, and peas are present in 50% of the Central Italian sites. In the Early Medieval age, pulses are present in many Central Italian sites, especially blue/red peas, broad beans and other Fabaceae. Lentils and broad beans are also cultivated in almost half of the Northern Italian sites. The importance of pulses in Central Italy is confirmed by the 11th c. samples, where every species is present in over 66% of the sites and Fabaceae and blue/red peas are found in every sample. Conversely, in Northern Italy broad bean is found in 66% of the sites.

Figure 5.4: Diachronical heatmap of pulses in the Italian macroregions

5.2.1.3 Fruits and nuts

Olive and grape are two essential crops in the Italian peninsula. Olive pits, as can be expected, are more ubiquitous in Southern Italy, where they are present in >87% of the sites in Roman times and in over 58% of the sites in the following chronologies1. Conversely, the grape is important in Central and Northern Italy in the Late Roman, Early Medieval and Medieval ages.

Figure 5.5: Diachronical heatmap of fruits/nuts in the Italian macroregions

5.3 Richness and diversity

Section in progress

5.3.1 Richness and diversity in the Italian macroregions

Show the code: data preparation
# Species richness based on geographical features
# RELATIVE PROPORTIONS OF ARCHAEOBOT_VIZ QUERY EXPORT FROM THE DB
# (Condensed, without totals)

# Remove NAs
Df_Cond_Plants[is.na(Df_Cond_Plants)] <-0 

# Generate a dataframe with the relative proportions and round the results
Df_Cond_Plants_Rel <- decostand(Df_Cond_Plants[11:50], method = "total")
Df_Cond_Plants_Rel <- round(Df_Cond_Plants_Rel, digits=2)

# Add more info to the dataframe
Df_Cond_Plants_Rel_Richness_Diversity <- data.frame(
                                 "Geo" = Df_Cond_Plants$Geo,
                                 "Chronology" = Df_Cond_Plants$Chronology, 
                                 "Type"= Df_Cond_Plants$Type, 
                                 "Specnumber" = specnumber(Df_Cond_Plants_Rel),
                                 "Shannon Div" = diversity(Df_Cond_Plants_Rel),
                                 Df_Cond_Plants_Rel
)

Df_Cond_Plants_Rel_withMacroregion <- data.frame("Geo" = Df_Cond_Plants$Geo,
                                                 "Chronology" = Df_Cond_Plants$Chronology,
                                                 "Type"= Df_Cond_Plants$Type, 
                                                 "Macroregion" = Df_Cond_Plants$name_macroreg,
                                                 "Specnumber" = specnumber(Df_Cond_Plants_Rel[1:10]), #Only cereals
                                                 "Shannon Div" = diversity(Df_Cond_Plants_Rel[1:10]),
                                                 Df_Cond_Plants_Rel[1:10]
)

# Let's plot the diversity by macroregion

# Creating the dataframes for the R, LR and EMA chronologies
# Note: despite "Plants" in the object names, these contain only the cereal columns
Df_Cond_Plants_Rel_withMacroregion.R <- filter(Df_Cond_Plants_Rel_withMacroregion, Chronology == "R")
Df_Cond_Plants_Rel_withMacroregion.LR <- filter(Df_Cond_Plants_Rel_withMacroregion, Chronology == "LR")
Df_Cond_Plants_Rel_withMacroregion.EMA <- filter(Df_Cond_Plants_Rel_withMacroregion, Chronology == "EMA")
Show the code: plots
pal_RichnessvsGeo <- c("cadetblue3", "gold1",  "bisque4", "palegreen4")

plot_RichnessMacroReg.R <- ggplot(Df_Cond_Plants_Rel_withMacroregion.R, aes(x = Macroregion, y = Specnumber, fill = Macroregion)) +
  geom_violin(trim=FALSE) + 
  geom_boxplot(width=0.1, fill="white")+
  scale_fill_manual(values = pal_RichnessvsGeo) +
  geom_jitter(alpha=0.3)+
  scale_x_discrete(labels = c("Central Italy \n (n = 3)", "Northern Italy \n (n = 39)", "Southern Italy \n (n=31)")) +
  theme(legend.position = "none",
        plot.background = element_rect("white"),
        panel.background = element_rect("white"),
        panel.grid = element_line("grey90"),
        axis.line = element_line("gray25"),
        axis.text = element_text(size = 12, color = "gray25"),
        axis.title = element_text(color = "gray25"),
        legend.text = element_text(size = 12)) + 
  labs(x = "Macroregion",
       y = "Number of species per site",
       title = "R - Cereal richness")

plot_RichnessMacroReg.EMA <- ggplot(Df_Cond_Plants_Rel_withMacroregion.EMA, aes(x = Macroregion, y = Specnumber, fill = Macroregion)) +
  geom_violin(trim=FALSE) + 
  geom_boxplot(width=0.1, fill="white")+
  scale_fill_manual(values = pal_RichnessvsGeo) +
  geom_jitter(alpha=0.3)+
  scale_x_discrete(labels = c("Central Italy \n (n = 10)", "Northern Italy \n (n = 36)", "Southern Italy \n (n=17)")) +
  theme(legend.position = "none",
        plot.background = element_rect("white"),
        panel.background = element_rect("white"),
        panel.grid = element_line("grey90"),
        axis.line = element_line("gray25"),
        axis.text = element_text(size = 12, color = "gray25"),
        axis.title = element_text(color = "gray25"),
        legend.text = element_text(size = 12)) + 
  labs(x = "Macroregion",
       y = "Number of species per site",
       title = "EMA - Cereal richness")

Cereal richness is similar in Roman Northern and Southern Italian sites (Figure 5.6). Central Italy reports higher values, although this is based on only three sites and is therefore not reliable. During the Early Middle Ages, Central Italy is again the richest in cereals, closely followed by Northern Italy. Interestingly, Southern Italy still reports values very close to those of the Roman age. A full list of the Southern Italian EMA sites is reported in Table 5.1.

(a) Roman age.

(b) Early Medieval age.

Figure 5.6: Violin plots of cereal richness in the Italian macroregions. The grey dots (jitter) indicate the values for individual sites, while the white boxplot shows the median and the quartiles.
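
The richness and diversity columns in the data preparation code come from the vegan functions decostand(), specnumber() and diversity(). What these compute with their default settings can be sketched in base R as follows. The matrix `toy_counts` is hypothetical, with invented values, and the Shannon index is written out by hand here only to make the formula visible.

```r
# Hypothetical raw counts for two sites (invented values)
toy_counts <- rbind(
  "Site A" = c(Barley = 50, Emmer = 50, Rye = 0),
  "Site B" = c(Barley = 80, Emmer = 10, Rye = 10)
)

# Relative proportions per site (cf. decostand(method = "total"))
rel <- toy_counts / rowSums(toy_counts)

# Richness: number of taxa with a non-zero value (cf. specnumber())
richness <- rowSums(rel > 0)

# Shannon index H = -sum(p * ln(p)) over the taxa present (cf. diversity())
shannon <- apply(rel, 1, function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
})

print(data.frame(Richness = richness, Shannon = round(shannon, 3)))
```

In this example Site B is richer (three taxa against two) but has a lower Shannon value than Site A, because its assemblage is dominated by a single cereal. Richness and diversity can therefore point in different directions and are best read together.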

Table 5.1: List of Southern Italian sites with chronology EMA

| ID | Site | Region | Geography | Type | Culture/Influence |
|---|---|---|---|---|---|
| 98 | S. Maria in Cività, D85 | Molise | Hilltop | Urban | Lombard |
| 107 | S. Giovanni di Ruoti, Phase 3A | Basilicata | Mountain | Monastery | Lombard |
| 107 | S. Giovanni di Ruoti, Phase 3B | Basilicata | Mountain | Monastery | Lombard |
| 198 | Salapia, area botteghe, US 2475 | Puglia | Coast/Lagoon | Urban | Lombard |
| 198 | Salapia, area botteghe, US 2437 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, US 2054 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, US 2211-2217 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, 8th-9th c. | Puglia | Coast/Lagoon | Urban | Lombard |
| 196 | Faragola, wastepit 61 | Puglia | Plain | Rural, villa | Lombard |
| 196 | Faragola, wastepit 66 | Puglia | Plain | Rural, villa | Lombard |
| 234 | Colle Castellano, Phase 3-4 | Molise | Hill | Urban | Lombard |
| 177 | San Vincenzo al Volturno, kitchen area | Molise | Hill | Monastery | Lombard |
| 101 | Supersano, loc. Scorpo | Puglia | Plain | Rural | Byzantine |
| 250 | Apigliano, 9th-10th c., pits | Puglia | Plain | Rural | Byzantine |
| 250 | Apigliano, 10th-11th c., pits | Puglia | Plain | Rural | Byzantine |
| 196 | Faragola, granary A7 | Puglia | Plain | Rural, villa | Lombard |
| 196 | Faragola, granary A8 | Puglia | Plain | Rural, villa | Lombard |

5.4 Cereals regionality

5.4.1 PERMANOVA

Notes on terminology: PERMANOVA

Permutational multivariate analysis of variance (PERMANOVA) is a non-parametric multivariate statistical test used to compare groups of objects. It tests the null hypothesis that the centroids and dispersion of the groups, as defined in the chosen measure space, are identical. The null hypothesis is rejected if either the centroid or the spread of the objects differs between the groups. The test is based on a prior calculation of the distance between any two objects included in the experiment2 (Anderson (2017)). In this context, the null hypothesis is that there is no regional difference in the cereals dataset, with cereals being evenly distributed across macroregions and chronologies.
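
For reference, the test statistic can be written out for the one-factor case. The notation below is introduced here for illustration and does not come from this chapter: N is the number of sites, a the number of groups, n_g the number of sites in group g, and d_ij the distance between sites i and j.

```latex
\begin{aligned}
SS_T &= \frac{1}{N} \sum_{i<j} d_{ij}^{2} \\
SS_W &= \sum_{g=1}^{a} \frac{1}{n_g} \sum_{\substack{i<j \\ i,\,j \,\in\, g}} d_{ij}^{2} \\
SS_A &= SS_T - SS_W \\
F &= \frac{SS_A / (a - 1)}{SS_W / (N - a)}
\end{aligned}
```

The p-value is obtained by recomputing this pseudo-F after repeatedly shuffling the group labels among the sites and counting how often the permuted value is at least as large as the observed one. With two groups, a - 1 = 1, and the residual degrees of freedom are N - 2.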

The suggestion of an Early Medieval shift in cereal farming made in Section 5.2.1 and Section 5.3.1 needs statistical support. Considering that the data is not unimodal and that we are dealing with presence/absence analysis, the best choice is to use a non-parametric test such as PERMANOVA on the early medieval botanical dataset. Prior to performing the test, it was necessary to pre-process the data by:

  • Selecting all the cereals columns of the plant remains table, keeping some categorical variables: Macroregion, Chronology, Geography and Type.

  • Removing the empty rows (caused by the fact that some sites have seeds/fruits, but not cereals).

  • Transforming the raw counts into presence/absence, using the function decostand() (method=pa) in the R package vegan (Oksanen et al. (2020)).

Show the code: Pre-processing
# Testing the results: Regionality in the dataset? 
library(vegan)

set.seed(29)


# Pre processing: remove empty rows

# Note: The input table is the CONDENSED table without totals

# Selecting all the cereals columns of the plant remains table, keeping some categorical variables 
cer_macroreg_ubiquity_transp.tot <- Df_Cond_Plants[c(4,5,6,7,11:19)]

# Selecting all rows with data (since we selected only with cereals, and some sites only had fruits/pulses we might have empty rows)
cer_macroreg_ubiquity_transp.tot <- cer_macroreg_ubiquity_transp.tot[rowSums(cer_macroreg_ubiquity_transp.tot[5:13])>0,]

# Assigning a column name "Macroregion"
colnames(cer_macroreg_ubiquity_transp.tot)[1] = "Macroregion"

# Selecting the Chronology of interest (EMA) and excluding Central Italy
cer_macroreg_ubiquity_transp.tot<- filter(cer_macroreg_ubiquity_transp.tot, Macroregion!="Central Italy" & Chronology=="EMA")

# dividing categorical and numerical columns
cer_macroreg_ubiquity_transp.categ <- cer_macroreg_ubiquity_transp.tot[1:4]
cer_macroreg_ubiquity_transp.data <- cer_macroreg_ubiquity_transp.tot[5:13]

# Converting the numerical columns into a presence/absence matrix (using method=pa)
cer_macroreg_ubiquity_transp.dist <- decostand(cer_macroreg_ubiquity_transp.data, method="pa", na.rm=TRUE)

After the pre-processing, it was possible to run the PERMANOVA using the function adonis2() in the package vegan. The function creates a distance matrix and computes an analysis of variance on that matrix. The method chosen to calculate the distance matrix is the Jaccard distance. The Jaccard distance (Kosub (2019)), based on the Jaccard similarity index, is a measure of dissimilarity between sample sets. Compared with other dissimilarity indices, it is more appropriate for presence/absence analyses, as it is not based on Euclidean distance.
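
To make the measure concrete, here is a small worked example of the Jaccard distance between two presence/absence records, written in base R. The two site vectors are hypothetical, with invented values, and `jaccard_distance()` is an illustrative helper, not a function from vegan.

```r
# Hypothetical presence/absence records for two sites (invented values)
site_a <- c(Barley = 1, Emmer = 1, Rye = 0, Oats = 0, Sorghum = 0)
site_b <- c(Barley = 1, Emmer = 0, Rye = 1, Oats = 1, Sorghum = 0)

# Jaccard distance = 1 - (taxa shared by both sites / taxa present at either site)
jaccard_distance <- function(x, y) {
  shared <- sum(x > 0 & y > 0)  # taxa present at both sites
  either <- sum(x > 0 | y > 0)  # taxa present at one or both sites
  1 - shared / either
}

d <- jaccard_distance(site_a, site_b)
print(d)
```

The two sites share one taxon (barley) out of the four recorded at either site, giving a distance of 1 - 1/4 = 0.75. Sorghum, absent from both, does not enter the calculation, so joint absences do not make two sites look more similar. The distance is undefined when both records are empty, which is one reason rows without cereals have to be removed during pre-processing.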

Results of adonis2().

Show the code: adonis2()
cer_macroreg_ubiquity_transp.div <- adonis2(
  cer_macroreg_ubiquity_transp.dist ~ Macroregion, 
  data = cer_macroreg_ubiquity_transp.categ,
  permutations = 10000, method="jaccard"
  )
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 10000

adonis2(formula = cer_macroreg_ubiquity_transp.dist ~ Macroregion, data = cer_macroreg_ubiquity_transp.categ, permutations = 10000, method = "jaccard")
            Df SumOfSqs      R2      F    Pr(>F)    
Macroregion  1   1.3024 0.13495 7.3322 9.999e-05 ***
Residual    47   8.3487 0.86505                     
Total       48   9.6512 1.00000                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

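The results of the PERMANOVA indicate that the variable Macroregion is highly significant (p ≈ 0.0001, well below the 0.001 threshold): a difference this large between macroregions would be very unlikely if cereals were distributed without regard to region. Macroregion can therefore be treated as a discriminant in the early medieval dataset, although it accounts for only about 13% of the total variation (R2 = 0.135).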
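
The figures in the adonis2() table are linked by simple arithmetic, which can be checked by hand. The sketch below uses only the rounded values printed above, so small discrepancies in the last digits are expected; the variable names are chosen for this example.

```r
# Values copied from the adonis2() output above (already rounded there)
ss_macroregion <- 1.3024
ss_residual    <- 8.3487
ss_total       <- 9.6512
df_macroregion <- 1
df_residual    <- 47

# R2: share of the total sum of squares attributed to Macroregion
r2 <- ss_macroregion / ss_total

# Pseudo-F: mean square of the term divided by the residual mean square
pseudo_f <- (ss_macroregion / df_macroregion) / (ss_residual / df_residual)

# Smallest p-value obtainable with 10000 permutations: the observed
# statistic is counted together with the permuted ones, giving 1 / (n + 1)
n_perm <- 10000
p_min  <- 1 / (n_perm + 1)

print(c(R2 = r2, F = pseudo_f, p_min = p_min))
```

The reported Pr(>F) of 9.999e-05 equals this minimum, which suggests that none of the 10000 permuted datasets produced a pseudo-F as large as the observed one. The p-value is therefore bounded by the number of permutations rather than being an exact probability.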

After running the PERMANOVA, it is necessary to check the homogeneity of the group dispersions in order to confirm the results (especially when dealing with small groups of data), since a PERMANOVA can also return a significant result when groups differ in spread rather than in location. The function betadisper() from the package vegan provides the distances of group samples from their centroids. If the dispersions are similar, the null hypothesis of no difference in dispersion between groups cannot be rejected. The dispersions can be tested with an analysis of variance (ANOVA).
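The idea behind this check can be sketched in Python. This is a simplified illustration, not vegan's implementation: betadisper() works in principal coordinate space and treats negative eigenvalues separately, whereas the sketch relies on an identity that is exact only for Euclidean-embeddable distances. The function name distances_to_centroid and the toy coordinates are assumptions made for the example.

```python
import numpy as np
from scipy.stats import f_oneway

def distances_to_centroid(dist, groups):
    """Distance of each sample to its own group centroid, from a distance matrix.

    Uses d^2(i, centroid) = mean_j d_ij^2 - sum_jk d_jk^2 / (2 n^2),
    computed within each group of n samples.
    """
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    out = np.zeros(len(groups))
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        d2 = dist[np.ix_(idx, idx)] ** 2
        n = len(idx)
        sq = d2.mean(axis=1) - d2.sum() / (2 * n**2)
        out[idx] = np.sqrt(np.clip(sq, 0, None))  # clip tiny negatives from rounding
    return out

# Toy one-dimensional coordinates for two hypothetical groups
coords = np.array([0.0, 2.0, 4.0, 10.0, 11.0, 15.0])
groups = np.array(["A", "A", "A", "B", "B", "B"])
dist = np.abs(coords[:, None] - coords[None, :])

dists = distances_to_centroid(dist, groups)

# One-way ANOVA on the distances: a large p-value gives no evidence
# that the two groups differ in dispersion
result = f_oneway(dists[groups == "A"], dists[groups == "B"])
print(dists, result.pvalue)
```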

Show the code: Betadisper()
# adonis2() computes the distance matrix internally, but betadisper() needs it as a separate object

# Distance dissimilarity matrix with the Jaccard method
cer_macroreg_ubiquity_transp.dist2 <- vegdist(cer_macroreg_ubiquity_transp.dist, method="jaccard", na.rm=TRUE)

# Betadisper: distances of group samples from centroids
cer_macroreg_ubiquity_transp.betadisper <- betadisper(cer_macroreg_ubiquity_transp.dist2, cer_macroreg_ubiquity_transp.categ$Macroregion)

Results of anova() on the betadisper.

Show the code: ANOVA on betadisper()
# We will see that the ANOVA's p-value is not significant, meaning that group dispersions can be treated as homogeneous
#("Null hypothesis of no difference in dispersion between groups"; https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/betadisper).

anova(cer_macroreg_ubiquity_transp.betadisper) # This should not be significant!
Analysis of Variance Table

Response: Distances
          Df  Sum Sq   Mean Sq F value Pr(>F)
Groups     1 0.00211 0.0021112  0.0715 0.7903
Residuals 47 1.38696 0.0295098               

(a) Groups dispersions plot with confidence ellipses.

(b) Boxplot showing equal distances from centroid.

Figure 5.7: Results of the betadisper() (groups dispersions) on the distance matrix calculated with the Jaccard method.

The betadisper() graphs (Figure 5.7) show similar distances from the centroids for the categories Northern Italy and Southern Italy. In addition, the ANOVA on the betadisper() shows that the difference in dispersion is not significant (p = 0.79, well above the significance threshold), meaning that the group dispersions can be considered homogeneous. We can now be more confident of the PERMANOVA results and accept the difference between the two groups of sites under investigation. In other words, the Southern and Northern Italian groups of sites are different during the Early Middle Ages.

Running the same tests on the Roman sites failed to separate the two groups of sites, suggesting that there was no major difference between Northern and Southern Italy in the types of cereals cultivated during the Roman age.

5.4.2 nMDS

5.4.3 NCA

Notes on terminology: Wasserstein metric

The Wasserstein distance (or earth mover’s distance) is a measure of the distance between two probability distributions on a metric space.
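A small Python example may help to read this definition. It uses the wasserstein_distance() function from scipy on made-up one-dimensional values (all numbers are hypothetical): shifting a sample by a constant moves all of its mass by that amount, and optional weights change how much mass sits on each value.

```python
from scipy.stats import wasserstein_distance

# Two hypothetical one-dimensional samples; the second is the first shifted by 1
u_values = [0.0, 1.0, 3.0]
v_values = [1.0, 2.0, 4.0]
d_shift = wasserstein_distance(u_values, v_values)

# Weighted version: each weight belongs to the value in the same position.
# u puts 3/4 of its mass on 0, v puts 3/4 of its mass on 1.
d_weighted = wasserstein_distance(
    [0.0, 1.0], [0.0, 1.0],
    u_weights=[3.0, 1.0], v_weights=[1.0, 3.0],
)

print(d_shift, d_weighted)
```

In the weighted case, half of the total mass has to move from 0 to 1, which gives a distance of 0.5.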

In addition to statistically testing the separation between the Northern and Southern Italian early medieval cereals datasets (Section 5.4.1), it is possible to measure the distance between groups of sites in both the Roman age and the Early Middle Ages. For this task, a machine learning algorithm for metric learning was chosen: Neighborhood Components Analysis (NCA), implemented in the Python class NeighborhoodComponentsAnalysis (in sklearn.neighbors). A more in-depth explanation of the algorithm can be read in Section 4.6.2.4.

To work with balanced groups of samples, the group sizes were arbitrarily set to 20 random samples, allowing replacement (meaning that a sample can randomly be picked more than once). The pandas method sample() was used to select the random samples. Because the samples are unevenly distributed across groups, a cautious approach was required: the macroregion Central Italy and the chronologies LR (Late Roman) and Ma (11th c. onwards) were excluded from this test. The NCA was run with a reduction to a single dimension, and KDE plots were used to visualize the results. Reducing the data to one dimension makes it straightforward to calculate the distance between the groups.

Figure 5.8 (a) shows the NCA performed on the Roman cereals presence/absence dataset. As already pointed out, the PERMANOVA did not produce significant results for this dataset, and the Wasserstein distance (calculated with the wasserstein_distance() function in the scipy library) is indeed shorter for the Roman dataset. For both chronologies there is an overlap in the curves, which is more considerable in the Roman age (indicating that the groups of samples are more similar). The overlap for the EMA groups (Figure 5.8, b) is probably due to the fact that the presence of the noble grains is not by itself a ‘marker’ of Southern Italian sites: these grains are also very common in the North. The difference is that in the South noble grains are not cultivated in conjunction with other grains. The graph for the EMA chronology shows a clearer separation of the macroregional groups, with some minor overlaps. Moreover, the graph also displays variability in the Northern Italian dataset. The variability can also be assessed from the outliers in the boxplots in Figure 5.9.
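The workflow described above can be tried end to end on simulated data. The Python sketch below is self-contained and only illustrative: the taxa names, presence probabilities, group sizes and the helper fake_region() are invented for the example, and the Wasserstein distance is left unweighted. It balances two groups by drawing 20 rows each with replacement, reduces the presence/absence data to one dimension with NCA, and measures the distance between the two resulting distributions.

```python
import numpy as np
import pandas as pd
from scipy.stats import wasserstein_distance
from sklearn.neighbors import NeighborhoodComponentsAnalysis

rng = np.random.default_rng(0)
taxa = [f"taxon_{i}" for i in range(6)]  # hypothetical column names

def fake_region(name, probs, n):
    """Simulate n presence/absence samples with per-taxon presence probabilities."""
    data = (rng.random((n, len(probs))) < np.array(probs)).astype(int)
    df = pd.DataFrame(data, columns=taxa)
    df["Macroregion"] = name
    return df

# Two simulated groups of unequal size (invented probabilities)
north = fake_region("Northern Italy", [0.9, 0.8, 0.7, 0.6, 0.5, 0.4], 30)
south = fake_region("Southern Italy", [0.9, 0.8, 0.2, 0.1, 0.1, 0.1], 12)

# Balance the groups: 20 rows each, drawn with replacement
balanced = pd.concat(
    [g.sample(20, replace=True, random_state=7) for g in (north, south)],
    ignore_index=True,
)

# Supervised reduction to a single dimension
nca = NeighborhoodComponentsAnalysis(n_components=1, init="lda", random_state=0)
reduced = nca.fit_transform(balanced[taxa], balanced["Macroregion"])
balanced["value"] = reduced.ravel()

# Distance between the two one-dimensional distributions
is_north = balanced["Macroregion"] == "Northern Italy"
w = wasserstein_distance(
    balanced.loc[is_north, "value"], balanced.loc[~is_north, "value"]
)
print(w)
```

A larger value of w indicates that the two groups sit further apart along the learned dimension; the absolute figure depends on the scale of the NCA projection, so it is mainly useful for comparing datasets processed in the same way.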

Source code:

Show the code: Libraries
# Load Python libraries
#!pip install pandas
import pandas as pd
import os

#!pip install scikit-hubness
import random
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

from sklearn.neighbors import (KNeighborsClassifier,
                                 NeighborhoodComponentsAnalysis)

from scipy.stats import wasserstein_distance

import seaborn as sns

# Set seed
random.seed(10)
Show the code: Selecting random samples, with replacement
#R
df_R_SI = df_R[df_R["Macroregion"]=="Southern Italy"].sample(20, random_state=7, replace=True)
df_R_NI = df_R[df_R["Macroregion"]=="Northern Italy"].sample(20, random_state=7, replace=True)
# Create a dataset with northern and southern Italy
df_R_merge = pd.concat([df_R_SI, df_R_NI], ignore_index=True)

data_R_byPlantGroup = df_R_merge.drop(['Chronology','Type', 'Macroregion', 'Weight'], axis=1)
labels_R_byPlantGroup = df_R_merge.iloc[:,2] # nrow, 0 for Chronology - nrow, 1 for Type - nrow,2 for Macroregion

#EMA
df_EMA_SI = df_EMA[df_EMA["Macroregion"]=="Southern Italy"].sample(20, random_state=7, replace=True)
df_EMA_NI = df_EMA[df_EMA["Macroregion"]=="Northern Italy"].sample(20, random_state=7, replace=True)
df_EMA_merge = pd.concat([df_EMA_SI, df_EMA_NI], ignore_index=True)
 
data_EMA_byPlantGroup = df_EMA_merge.drop(['Chronology','Type', 'Macroregion', 'Weight'], axis=1)
labels_EMA_byPlantGroup = df_EMA_merge.iloc[:,2] # nrow, 0 for Chronology - nrow, 1 for Type - nrow,2 for Macroregion
Show the code: Performing the NCA
#R
nca_for_KDE_R_PlantGroup = NeighborhoodComponentsAnalysis(n_components =1, init="lda").fit(data_R_byPlantGroup, labels_R_byPlantGroup)
reduction_for_KDE_R_PlantGroup = nca_for_KDE_R_PlantGroup.transform(data_R_byPlantGroup)
df_R_merge["value"] = reduction_for_KDE_R_PlantGroup
     
#EMA
nca_for_KDE_EMA_PlantGroup = NeighborhoodComponentsAnalysis(n_components =1, init="lda").fit(data_EMA_byPlantGroup, labels_EMA_byPlantGroup)
reduction_for_KDE_EMA_PlantGroup = nca_for_KDE_EMA_PlantGroup.transform(data_EMA_byPlantGroup)
df_EMA_merge["value"] = reduction_for_KDE_EMA_PlantGroup
Show the code: Wasserstein distance
# R
df_R_North = df_R_merge[df_R_merge["Macroregion"] == "Northern Italy"]
df_R_South = df_R_merge[df_R_merge["Macroregion"] == "Southern Italy"]

# Each set of weights must belong to the same samples as the values it accompanies
wasserstein_distance(df_R_South["value"], df_R_North["value"], u_weights=df_R_South["Weight"], v_weights=df_R_North["Weight"])

#EMA

df_EMA_North = df_EMA_merge[df_EMA_merge["Macroregion"] == "Northern Italy"]
df_EMA_South = df_EMA_merge[df_EMA_merge["Macroregion"] == "Southern Italy"]

# Each set of weights must belong to the same samples as the values it accompanies
wasserstein_distance(df_EMA_South["value"], df_EMA_North["value"], u_weights=df_EMA_South["Weight"], v_weights=df_EMA_North["Weight"])

Plots

Show the code: Plotting the NCA
NCA_KDE_1D, ax = plt.subplots(1, 2, figsize=(10, 5), sharey=True, sharex=True)

sns.kdeplot(data=df_R_merge, x="value", ax=ax[0], hue="Macroregion", fill=True, alpha=.1, palette="colorblind", linewidth=1, legend=False).set(xlabel='NCA', title="(a) Roman age")

sns.kdeplot(data=df_EMA_merge, x="value", ax=ax[1], hue="Macroregion", fill=True, alpha=.1, palette="colorblind", linewidth=1).set(xlabel='NCA', title="(b) early Middle ages")

NCA_KDE_1D.text(0.08, 0.03, 'Weighted Wasserstein Distance: W = 67.64 \nPERMANOVA test: p>0.05', fontsize=10)
NCA_KDE_1D.text(0.68, 0.03, 'Weighted Wasserstein Distance: W = 200.65\nPERMANOVA test: p<0.001', fontsize=10)
plt.tight_layout()
plt.subplots_adjust(bottom=0.19)
plt.show()

Figure 5.8: One-Dimension NCA on the Presence/Absence Cereals Dataset

Figure 5.9: Boxplots showing the NCA value for Northern and Southern Early Medieval Italy.

5.4.4 Network Analysis of cereals in EMA sites

Section in progress

  1. The Late Roman values for Southern Italy are only based on 5 samples (3 of which are from the same site, Salapia) so the values are not very trustworthy.↩︎

  2. Source: Wikipedia. Change the source later↩︎